Extracting Source Models from Java Programs: Parse, Disassemble, or Profile?
Authors
Abstract
Source models of software systems are often created during re-engineering to support tasks such as reachability analysis and software architecture recovery. It is therefore vital to be able to create source models that are both detailed and accurate. In practice, however, creating these models is difficult and error-prone: extraction tools often tell only part of the story. We have been working on the automated extraction of source models for programs written in Java. Our approach considers Java programs from three points of view: parsing, disassembly, and profiling. We have found that these three techniques have complementary advantages. Parsing source code provides the most detailed information, but it is the most complex to implement. Disassembling Java bytecode gives results similar to parsing, but is technically less complex. Profiling provides the least detail, but gives important feedback on run-time behaviour, such as polymorphic function calls and reflective instantiation of objects from string input. We have applied our tools to several systems, including Sun’s javac Java compiler and the Jigsaw web server. We compare the source models extracted by each of our tools and describe the reasons for differences in the extracted models.
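The point about run-time behaviour can be illustrated with a small sketch. This is not the authors' tooling: the class `ReflectiveCallDemo`, the `Shape` interface, and the `observedEdges` list are hypothetical names chosen for illustration. The sketch assumes a program that instantiates a class from a string and then makes a polymorphic call. A parser or disassembler sees only a call to `Class.forName` and a call through the `Shape` interface, whereas a profiler-style recorder can note which concrete class was actually created and invoked.

```java
import java.util.ArrayList;
import java.util.List;

public class ReflectiveCallDemo {
    public interface Shape { String name(); }

    public static class Circle implements Shape {
        public String name() { return "Circle"; }
    }

    public static class Square implements Shape {
        public String name() { return "Square"; }
    }

    // Hypothetical stand-in for a profiler: caller -> callee edges seen at run time.
    public static final List<String> observedEdges = new ArrayList<>();

    // Instantiates a nested class chosen by a string, e.g. user input.
    // The concrete target is not visible in the source text or the bytecode.
    public static Shape create(String simpleName) {
        String binaryName = ReflectiveCallDemo.class.getName() + "$" + simpleName;
        try {
            Class<?> cls = Class.forName(binaryName);
            Shape shape = (Shape) cls.getDeclaredConstructor().newInstance();
            observedEdges.add("create -> " + cls.getSimpleName() + ".<init>");
            return shape;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + binaryName, e);
        }
    }

    // Polymorphic call: statically the target is Shape.name(),
    // at run time it is the name() method of one concrete class.
    public static String describe(Shape shape) {
        observedEdges.add("describe -> " + shape.getClass().getSimpleName() + ".name");
        return shape.name();
    }

    public static void main(String[] args) {
        String choice = args.length > 0 ? args[0] : "Circle";
        describe(create(choice));
        observedEdges.forEach(System.out::println);
    }
}
```

Running the program with different arguments records different edges, which is the kind of information a purely static source model would have to approximate or omit.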
Similar resources
Integrating Information Sources for Visualizing Java Programs
This paper describes the integration of information sources to support the exploration of source code and documentation of Java programs. There are many public domain tools that are available for extracting information and documentation from Java programs. We describe how data integration and presentation integration were used to enable the visualization of this information within a software ex...
Animating the Formalised Semantics of a Java-Like Language
Considerable effort has gone into the techniques of extracting executable code from formal specifications and animating them. We show how to apply these techniques to the large JinjaThreads formalisation. It models a substantial subset of multithreaded Java source and bytecode in Isabelle/HOL and focuses on proofs and modularity whereas code generation was of little concern in its design. Emplo...
Program Annotation in XML: A Parse-Tree Based Approach
In this paper we describe a technique that can be used to annotate source code with syntactic tags in XML format. This is achieved by modifying the parser generator bison to emit these tags for an arbitrary LALR grammar. We also discuss an immediate application of this technique, a portable modification of the gcc compiler, that allows for XML output for C, Objective C, C++ and Java programs. W...
Reverse Engineering: An Analysis of Dynamic Behavior of Object Oriented Programs by Extracting UML Interaction Diagram
The Unified Modeling Language (UML) is widely used as a high level object oriented specification language. UML is a good target language for the reverse engineering models since it is largely used and offers different diagrams. In this paper we present a novel approach in which reverse engineering is performed using UML as the modeling language used to achieve a representation of the implemente...
Algorithmic Aspects of Natural Language Processing
Examples of natural languages are Chinese, English and Italian. They are called natural as they evolved in a more or less natural way, without too many deliberate considerations. This sets them apart from formal languages, amongst which are programming languages, which are designed to allow easy processing by computer algorithms. Typically, programs in programming languages such as C or Java ca...